In this post, I am exploring network analysis techniques in a family network of major characters from Game of Thrones.


What is a network?

A network in this context is a graph of interconnected nodes/vertices. Nodes can e.g. be people in a social network, genes in a co-expression network, etc. Nodes are connected via ties/edges.


What can network analysis tell us?

Network analysis can e.g. be used to explore relationships in social or professional networks. There, we would typically ask questions like:

These answers can give us a lot of information about the patterns of how people interact. nternational Network for Social Network Analysis, ‘Social network analysis is focused on uncovering the patterning of people’s interaction’.


The Game of Thrones character network

The basis for this network is Kaggle’s Game of Throne dataset but because most family relationships were missing in that dataset, I added the missing information in part by hand (based on A Wiki of Ice and Fire) and by scraping information from the Game of Thrones wiki. You can find the full code for how I generated the network on my Github page.

library(tidyverse)
library(igraph)
library(statnet)
load("union_edges.RData")
load("union_characters.RData")


I am using igraph to plot the network. To do so, I first create the graph from the edge- and nodetable. An edgetable contains source and target nodes in the first two columns and optinally additional columns with edge attributes. Here, I have the type of interaction (mother, father or spouse), the color and linetype I want to assign to each edge.

Because the books and the TV series differ slightly, I have introduced edges that are only supported or hinted at by the TV series and are not part of the original narrative in the books. These edges are marked by being dotted instead of solid. An additional color for edges with unspecified parental origin are introduced as well. Originally, these served for interactions that were extracted from character names (i.e. characters that ended with “… son/daughter of …”) and could either mean mother or father. Now, they show unclear parentage or cases where there are a biological and a de facto father, as in the case of Jon Snow.

head(union_edges)
##               source            target   type   color   lty
## 1         Lysa Arryn      Robert Arryn mother #7570B3 solid
## 2       Jasper Arryn        Alys Arryn father #1B9E77 solid
## 3       Jasper Arryn         Jon Arryn father #1B9E77 solid
## 4          Jon Arryn      Robert Arryn father #1B9E77 solid
## 110 Cersei Lannister  Tommen Baratheon mother #7570B3 solid
## 210 Cersei Lannister Joffrey Baratheon mother #7570B3 solid

The nodetable contains one row for each character that is either a source or a target in the edgetable. We can give any number and type of node attribute but here, I chose the followin columns from the original Kaggle dataset: gender/male (male = 1, female = 0), house (as the house each character was born into) and popularity. House2 was meant to assign a color to only the major houses. Shape represents the gender.

head(union_characters)
##            name male culture          house popularity      house2   color
## 1    Alys Arryn    0    <NA>    House Arryn 0.08026756        <NA>    <NA>
## 2 Elys Waynwood    0    <NA> House Waynwood 0.07023411        <NA>    <NA>
## 3  Jasper Arryn    1    <NA>    House Arryn 0.04347826        <NA>    <NA>
## 4   Jeyne Royce    0    <NA>    House Royce 0.00000000        <NA>    <NA>
## 5     Jon Arryn    1 Valemen    House Arryn 0.83612040        <NA>    <NA>
## 6    Lysa Arryn    0    <NA>    House Tully 0.00000000 House Tully #F781BF
##    shape
## 1 circle
## 2 circle
## 3 square
## 4 circle
## 5 square
## 6 circle

By default, we have a directed graph.

union_graph <- graph_from_data_frame(union_edges, directed = TRUE, vertices = union_characters)

For plotting the legend, I am summarising the edge and node colors.

color_vertices <- union_characters %>%
  group_by(house, color) %>%
  summarise(n = n()) %>%
  filter(!is.na(color))

colors_edges <- union_edges %>%
  group_by(type, color) %>%
  summarise(n = n()) %>%
  filter(!is.na(color))

Now, we can plot the graph object (here with Fruchterman-Reingold layout):

layout <- layout_with_fr(union_graph)
## png 
##   2
plot(union_graph,
     layout = layout,
     vertex.label = gsub(" ", "\n", V(union_graph)$name),
     vertex.shape = V(union_graph)$shape,
     vertex.color = V(union_graph)$color, 
     vertex.size = (V(union_graph)$popularity + 0.5) * 5, 
     vertex.frame.color = "gray", 
     vertex.label.color = "black", 
     vertex.label.cex = 0.8,
     edge.arrow.size = 0.5,
     edge.color = E(union_graph)$color,
     edge.lty = E(union_graph)$lty)
legend("topleft", legend = c("Node color:", as.character(color_vertices$house), NA, "Edge color:", as.character(colors_edges$type)), pch = 19,
       col = c(NA, color_vertices$color, NA, NA, colors_edges$color), pt.cex = 2, cex = 1, bty = "n", ncol = 1)

Node color shows the major houses, node size the character’s popularity and node shape their gender (square for male, circle for female). Edge color shows interaction type.

As we can see, even with only a subset of characters from the Game of Thrones world, the network is already quite big. You can click on the image to open the pdf and zoom into specific parts of the plot and read the node labels/character names.

What we can see right away is that there are only limited connections between houses and that the Greyjoys are the only house that has no ties to any of the others.


How do we find out who the most important character(s) is/are in this network? We consider a character “important” if he has connections to many other characters.

There are a few network properties, that tell us more about this. For most of them, I am considering the network as undirected to account for parent/child relationships as being mutual.

union_graph_undir <- as.undirected(union_graph, mode = "collapse")


Centrality

Centrality describes the number of edges that are in- or outgoing to/from nodes.

“Centralization is a method for creating a graph level centralization measure from the centrality scores of the vertices.” centralize() help

centr_degree(union_graph_undir, mode = "total")$centralization
## [1] 0.04282795
centr_clo(union_graph_undir, mode = "total")$centralization
## [1] 0.01414082
centr_eigen(union_graph_undir, directed = FALSE)$centralization
## [1] 0.8787532


Node degree

Node degree or degree centrality describes how central a node is in the network (i.e. how many in- and outgoing edges it has or to how many other nodes it is directly connected via one edge).

“The degree of a vertex is its most basic structural property, the number of its adjacent edges.” From the help pages of degree()

We can calculate the number of out- or ingoing edges of each node, or - as I am doing here - the sum of both.

union_graph_undir_degree <- igraph::degree(union_graph_undir, mode = "total")

#standardized by number of nodes
union_graph_undir_degree_std <- union_graph_undir_degree / (vcount(union_graph_undir) - 1)
node_degree <- data.frame(degree = union_graph_undir_degree,
                          degree_std = union_graph_undir_degree_std) %>%
  tibble::rownames_to_column()

union_characters <- left_join(union_characters, node_degree, by = c("name" = "rowname"))

node_degree %>%
  arrange(-degree) %>%
  .[1:10, ]
##            rowname degree degree_std
## 1  Quellon Greyjoy     12 0.05797101
## 2      Walder Frey     10 0.04830918
## 3   Oberyn Martell     10 0.04830918
## 4     Eddard Stark      9 0.04347826
## 5    Catelyn Stark      8 0.03864734
## 6       Emmon Frey      7 0.03381643
## 7  Genna Lannister      7 0.03381643
## 8     Merrett Frey      7 0.03381643
## 9    Balon Greyjoy      7 0.03381643
## 10 Jason Lannister      7 0.03381643

In this case, the node degree tells us how many offspring and spouses a character had. With 3 wifes and several children, Quellon Greyjoy, the grandfather of Theon and Asha/Yara comes out on top (of course, had I included all offspring and wifes of Walder Frey’s, he would easily be on top but the network would have gotten infintely more confusing).


Closeness

The closeness of a node describes its distance to all other nodes. A node with highest closeness is more central and can spread information to many other nodes.

closeness <- igraph::closeness(union_graph_undir, mode = "total")

#standardized by number of nodes
closeness_std <- closeness / (vcount(union_graph_undir) - 1)
node_closeness <- data.frame(closeness = closeness,
                          closeness_std = closeness_std) %>%
  tibble::rownames_to_column()

union_characters <- left_join(union_characters, node_closeness, by = c("name" = "rowname"))

node_closeness %>%
  arrange(-closeness) %>%
  .[1:10, ]
##             rowname    closeness closeness_std
## 1       Sansa Stark 0.0002013288  9.726028e-07
## 2  Tyrion Lannister 0.0002012882  9.724070e-07
## 3   Tywin Lannister 0.0002011668  9.718201e-07
## 4  Joanna Lannister 0.0002005616  9.688965e-07
## 5      Eddard Stark 0.0002002804  9.675381e-07
## 6     Catelyn Stark 0.0001986492  9.596579e-07
## 7  Cersei Lannister 0.0001984915  9.588960e-07
## 8   Jaime Lannister 0.0001975894  9.545382e-07
## 9    Jeyne Marbrand 0.0001966568  9.500330e-07
## 10  Tytos Lannister 0.0001966568  9.500330e-07


Betweenness centrality

Betweenness describes the number of shortest paths between nodes. Nodes with high betweenness centrality are on the path between many other nodes, i.e. they are people who are key connections or bridges between different groups of nodes. In a social network, these nodes would be very important e.g. as targets for advertisement, because they are likely to pass on information to a wide reach of people.

The igraph function betweenness() calculates vertex betweenness, edge_betweenness() calculates edge betweenness:

“The vertex and edge betweenness are (roughly) defined by the number of geodesics (shortest paths) going through a vertex or an edge.” igraph help for estimate_betweenness()

betweenness <- igraph::betweenness(union_graph_undir, directed = FALSE)

# standardize by number of node pairs
betweenness_std <- betweenness / ((vcount(union_graph_undir) - 1) * (vcount(union_graph_undir) - 2))

node_betweenness <- data.frame(betweenness = betweenness,
                               betweenness_std = betweenness_std) %>%
  tibble::rownames_to_column() 

union_characters <- left_join(union_characters, node_betweenness, by = c("name" = "rowname"))

node_betweenness %>%
  arrange(-betweenness) %>%
  .[1:10, ]
##              rowname betweenness betweenness_std
## 1       Eddard Stark    6926.864      0.16244229
## 2        Sansa Stark    6165.667      0.14459140
## 3   Tyrion Lannister    5617.482      0.13173590
## 4    Tywin Lannister    5070.395      0.11890613
## 5   Joanna Lannister    4737.524      0.11109996
## 6  Rhaegar Targaryen    4301.583      0.10087667
## 7    Margaery Tyrell    4016.417      0.09418922
## 8           Jon Snow    3558.884      0.08345958
## 9        Mace Tyrell    3392.500      0.07955771
## 10   Jason Lannister    3068.500      0.07195957
edge_betweenness <- igraph::edge_betweenness(union_graph_undir, directed = FALSE)

data.frame(edge = attr(E(union_graph_undir), "vnames"),
           betweenness = edge_betweenness) %>%
  tibble::rownames_to_column() %>%
  arrange(-betweenness) %>%
  .[1:10, ]
##    rowname                              edge betweenness
## 1      160      Sansa Stark|Tyrion Lannister    5604.149
## 2      207          Sansa Stark|Eddard Stark    4709.852
## 3      212        Rhaegar Targaryen|Jon Snow    3560.083
## 4      296       Margaery Tyrell|Mace Tyrell    3465.000
## 5      213             Eddard Stark|Jon Snow    3163.048
## 6      131  Jason Lannister|Joanna Lannister    3089.500
## 7      159 Joanna Lannister|Tyrion Lannister    2983.591
## 8      171  Tyrion Lannister|Tywin Lannister    2647.224
## 9      192    Elia Martell|Rhaegar Targaryen    2580.000
## 10     300         Luthor Tyrell|Mace Tyrell    2565.000

This, we can now plot by feeding the node betweenness as vertex.size and edge betweenness as edge.width to our plot function:

plot(union_graph_undir,
     layout = layout,
     vertex.label = gsub(" ", "\n", V(union_graph_undir)$name),
     vertex.shape = V(union_graph_undir)$shape,
     vertex.color = V(union_graph_undir)$color, 
     vertex.size = betweenness * 0.001, 
     vertex.frame.color = "gray", 
     vertex.label.color = "black", 
     vertex.label.cex = 0.8,
     edge.width = edge_betweenness * 0.01,
     edge.arrow.size = 0.5,
     edge.color = E(union_graph_undir)$color,
     edge.lty = E(union_graph_undir)$lty)
legend("topleft", legend = c("Node color:", as.character(color_vertices$house), NA, "Edge color:", as.character(colors_edges$type)), pch = 19,
       col = c(NA, color_vertices$color, NA, NA, colors_edges$color), pt.cex = 2, cex = 1, bty = "n", ncol = 1)

Ned Stark is the character with highest betweenness. This makes sense, as he and his children have connections to many of the other houses and are the central points from which the story unfolds. However, we have to keep in mind here, that my choice of who is important enough to include in the network (e.g. the Stark ancestors) and who not (e.g. the whole complicated mess that is the Targaryen and Frey family tree) makes this result somewhat biased.


Diameter

In contrast to the shortest path between two nodes, we can also calculate the longest path, or diameter:

diameter(union_graph_undir, directed = FALSE)
## [1] 21

“get_diameter returns a path with the actual diameter. If there are many shortest paths of the length of the diameter, then it returns the first one found.” diameter() help

This, we can also plot:

union_graph_undir_diameter <- union_graph_undir
node_diameter <- get.diameter(union_graph_undir_diameter,  directed = FALSE)

V(union_graph_undir_diameter)$color <- scales::alpha(V(union_graph_undir_diameter)$color, alpha = 0.5)
V(union_graph_undir_diameter)$size <- 2

V(union_graph_undir_diameter)[node_diameter]$color <- "red"
V(union_graph_undir_diameter)[node_diameter]$size <- 5

E(union_graph_undir_diameter)$color <- "grey"
E(union_graph_undir_diameter)$width <- 1

E(union_graph_undir_diameter, path = node_diameter)$color <- "red"
E(union_graph_undir_diameter, path = node_diameter)$width <- 5

plot(union_graph_undir_diameter,
     layout = layout,
     vertex.label = gsub(" ", "\n", V(union_graph_undir_diameter)$name),
     vertex.shape = V(union_graph_undir_diameter)$shape,
     vertex.frame.color = "gray", 
     vertex.label.color = "black", 
     vertex.label.cex = 0.8,
     edge.arrow.size = 0.5,
     edge.lty = E(union_graph_undir_diameter)$lty)
legend("topleft", legend = c("Node color:", as.character(color_vertices$house), NA, "Edge color:", as.character(colors_edges$type)), pch = 19,
       col = c(NA, color_vertices$color, NA, NA, colors_edges$color), pt.cex = 2, cex = 1, bty = "n", ncol = 1)


Transitivity

“Transitivity measures the probability that the adjacent vertices of a vertex are connected. This is sometimes also called the clustering coefficient.” transitivity() help

We can calculate the transitivity or ratio of triangles to connected triples for the whole network:

transitivity(union_graph_undir, type = "global")
## [1] 0.2850679

Or for each node:

transitivity <- data.frame(name = V(union_graph_undir)$name,
      transitivity = transitivity(union_graph_undir, type = "local")) %>%
  mutate(name = as.character(name))

union_characters <- left_join(union_characters, transitivity, by = "name")

transitivity %>%
  arrange(-transitivity) %>%
  .[1:10, ]
##                 name transitivity
## 1       Robert Arryn            1
## 2   Ormund Baratheon            1
## 3     Selyse Florent            1
## 4  Shireen Baratheon            1
## 5   Amarei Crakehall            1
## 6       Marissa Frey            1
## 7        Olyvar Frey            1
## 8        Perra Royce            1
## 9        Perwyn Frey            1
## 10         Tion Frey            1


PageRank centrality

PageRank (originally from Google) is similar to eigenvector centrality.

page_rank <- page.rank(union_graph_undir, directed = FALSE)

page_rank_centrality <- data.frame(name = names(page_rank$vector),
      page_rank = page_rank$vector) %>%
  mutate(name = as.character(name))

union_characters <- left_join(union_characters, page_rank_centrality, by = "name")

page_rank_centrality %>%
  arrange(-page_rank) %>%
  .[1:10, ]
##                 name   page_rank
## 1     Oberyn Martell 0.018402407
## 2    Quellon Greyjoy 0.016128129
## 3        Walder Frey 0.012956029
## 4       Eddard Stark 0.011725019
## 5       Cregan Stark 0.010983561
## 6      Catelyn Stark 0.010555473
## 7       Lyarra Stark 0.009876629
## 8  Aegon V Targaryen 0.009688458
## 9      Balon Greyjoy 0.009647049
## 10         Jon Arryn 0.009623742


Matrix representation of a network

Connections between nodes can also be represented as an adjacency matrix. We can convert our graph object to an adjacency matrix with igraph’s as_adjacency_matrix() function. Whenever there is an edge between two nodes, this field in the matrix will get assigned a 1, otherwise it is 0.

adjacency <- as.matrix(as_adjacency_matrix(union_graph_undir))


Eigenvector centrality

We can also calculate the eigenvalues and eigenvectors of the adjacency matrix.

#degree diagonal matrix
degree_diag <- diag(1 / igraph::degree(union_graph_undir))

# PageRank matrix
pagerank <- adjacency %*% degree_diag

eigenvalues <- eigen(pagerank)

The eigenvector with the highest eigenvalue scores those vertices highly, that have many eges or that are connected to vertices with many edges.

eigenvector <- data.frame(name = rownames(pagerank),
           eigenvector = as.numeric(eigenvalues$vectors[, which.max(eigenvalues$values)]))

union_characters <- left_join(union_characters, eigenvector, by = "name")

eigenvector %>%
  arrange(-eigenvector) %>%
  .[1:10, ]
##                          name   eigenvector
## 1          Mellario of Norvos -0.0001772104
## 2               Robyn Ryswell -0.0001772104
## 3             Gilliane Glover -0.0001772104
## 4  Rickon Stark son of Benjen -0.0001772104
## 5                 Arra Norrey -0.0001772104
## 6                Sarella Sand -0.0001772104
## 7                Nymeria Sand -0.0001772104
## 8                  Obara Sand -0.0001772104
## 9                  Tyene Sand -0.0001772104
## 10                 Dorea Sand -0.0001772104

We can find the eigenvector centrality scores with:

eigen_centrality <- igraph::eigen_centrality(union_graph_undir, directed = FALSE)

eigen_centrality <- data.frame(name = names(eigen_centrality$vector),
           eigen_centrality = eigen_centrality$vector)

union_characters <- left_join(union_characters, eigen_centrality, eigenvector, by = "name")
## Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
## factor and character vector, coercing into character vector
eigen_centrality %>%
  arrange(-eigen_centrality) %>%
  .[1:10, ]
##                name eigen_centrality
## 1   Tywin Lannister        1.0000000
## 2  Cersei Lannister        0.9168980
## 3  Joanna Lannister        0.8358122
## 4   Tytos Lannister        0.8190076
## 5    Jeyne Marbrand        0.8190076
## 6   Genna Lannister        0.7788376
## 7   Jaime Lannister        0.7642870
## 8  Robert Baratheon        0.7087042
## 9        Emmon Frey        0.6538709
## 10      Walder Frey        0.6516021


union_characters %>%
  ggplot(aes(x = popularity, color = house2, fill = house2)) +
    geom_density(alpha = 0.3) +
    theme_bw()

union_characters %>%
  filter(popularity > 0.75) %>%
  dplyr::select(-contains("_std")) %>%
  gather(x, y, degree:eigen_centrality) %>%
  ggplot(aes(x = name, y = y, color = house2)) +
    geom_point(size = 3) +
    facet_wrap(~ x, ncol = 1, scales = "free") +
    theme_bw() +
    theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))
## Warning: Removed 1 rows containing missing values (geom_point).


Network properties

We can, for example, feed our adjacency matrix to other functions, like GenInd() from the NetIndices packages. This function calculates a number of network properties, like number of compartments (N), total system throughput (T..), total system throughflow (TST), number of internal links (Lint), total number of links (Ltot), like density (LD), connectance (C), average link weight (Tijbar), average compartment throughflow (TSTbar) and compartmentalization or degree of connectedness of subsystems in the network (Cbar).

library(NetIndices)
graph.properties <- GenInd(adjacency)
graph.properties
## $N
## [1] 208
## 
## $T..
## [1] 652
## 
## $TST
## [1] 652
## 
## $Lint
## [1] 652
## 
## $Ltot
## [1] 652
## 
## $LD
## [1] 3.134615
## 
## $C
## [1] 0.01514307
## 
## $Tijbar
## [1] 1
## 
## $TSTbar
## [1] 3.134615
## 
## $Cbar
## [1] 0.01086163


Alternatively, the network package provides additional functions to obtain network properties. Here, we can again feed in the adjacency matrix of our network.

library(network)
adj_network <- network(adjacency, directed = TRUE)

From this network object, we can e.g. get the number of dyads and edges within a network and the network size.

adj_network
##  Network attributes:
##   vertices = 208 
##   directed = TRUE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 652 
##     missing edges= 0 
##     non-missing edges= 652 
## 
##  Vertex attribute names: 
##     vertex.names 
## 
## No edge attributes
network.dyadcount(adj_network)
## [1] 43056
network.edgecount(adj_network)
## [1] 652
network.size(adj_network)
## [1] 208

From the sna package, we can e.g. use functions that tell us the graph density and the dyadic reciprocity of the vertices or edges

gden(adjacency)
## [1] 0.01514307
grecip(adjacency)
## Mut 
##   1
grecip(adjacency, measure = "edgewise")
## Mut 
##   1


Groups of nodes

We can also analyze dyads (pairs of two nodes), triads (groups of three nodes) and bigger cliques in our network. For dyads, we can use the function dyad_census() from igraph or dyad.census() from sna. Both are identical and calculate a Holland and Leinhardt dyad census with

  • mut: The number of pairs with mutual connections (in our case, spouses).
  • asym: The number of pairs with non-mutual connections (in our case, mother-child and father-child relationships).
  • null: The number of pairs with no connection between them.
#igraph::dyad_census(union_graph_undir)
sna::dyad.census(adjacency)
##      Mut Asym  Null
## [1,] 326    0 21202

The same can be calculated for triads (see ?triad_census for details on what each output means). While our original network was plotted as a directed network to show mother-child/father-child relationships as directional for plotting, it makes sense that we consider our network as undirected now that we want to analyze the network probabilities.

#igraph::triad_census(union_graph_undir)
sna::triad.census(adjacency)
##          003 012   102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U
## [1,] 1412100   0 65261    0    0    0    0    0    0    0 790    0    0
##      120C 210 300
## [1,]    0   0 105
triad.classify(adjacency, mode = "graph")
## [1] 2

We can also calculate the number of paths and cycles of any length we specify, here e.g. of length <= 5. For edges, we obtain the sum of counts for all paths or cycles up to the given maximum length. For vertices/nodes, we obtain the number of paths or cycles to which each node belongs.

node_kpath <- kpath.census(adjacency, maxlen = 5, mode = "graph", tabulate.by.vertex = TRUE, dyadic.tabulation = "sum")
edge_kpath <- kpath.census(adjacency, maxlen = 5, mode = "graph", tabulate.by.vertex = FALSE)
edge_kpath
## $path.count
##     1     2     3     4     5 
##   326  1105  2973  7183 17026
gplot(node_kpath$paths.bydyad,
      label.cex = 0.5, 
      vertex.cex = 0.75,
      displaylabels = TRUE,
      edge.col = "grey")

node_kcycle <- kcycle.census(adjacency, maxlen = 5, mode = "graph", tabulate.by.vertex = TRUE, cycle.comembership = "sum")
edge_kcycle <- kcycle.census(adjacency, maxlen = 5, mode = "graph", tabulate.by.vertex = FALSE)
edge_kcycle
## $cycle.count
##   2   3   4   5 
##   0 105 136  27
node_kcycle_reduced <- node_kcycle$cycle.comemb
node_kcycle_reduced <- node_kcycle_reduced[which(rowSums(node_kcycle_reduced) > 0), which(colSums(node_kcycle_reduced) > 0)]

gplot(node_kcycle_reduced,
      label.cex = 0.5, 
      vertex.cex = 0.75,
      displaylabels = TRUE,
      edge.col = "grey")

“A (maximal) clique is a maximal set of mutually adjacenct vertices.” clique.census() help

node_clique <- clique.census(adjacency, mode = "graph", tabulate.by.vertex = TRUE, clique.comembership = "sum")
edge_clique <- clique.census(adjacency, mode = "graph", tabulate.by.vertex = FALSE, clique.comembership = "sum")
edge_clique$clique.count
##   1   2   3 
##   0  74 105
node_clique_reduced <- node_clique$clique.comemb
node_clique_reduced <- node_clique_reduced[which(rowSums(node_clique_reduced) > 0), which(colSums(node_clique_reduced) > 0)]

gplot(node_clique_reduced,
      label.cex = 0.5, 
      vertex.cex = 0.75,
      displaylabels = TRUE,
      edge.col = "grey")

The largest group of nodes is:

vcol <- rep("grey80", vcount(union_graph_undir))
vcol[unlist(largest_cliques(union_graph_undir))] <- "red"

plot(union_graph_undir,
     layout = layout,
     vertex.label = gsub(" ", "\n", V(union_graph_undir)$name),
     vertex.shape = V(union_graph_undir)$shape,
     vertex.color = vcol, 
     vertex.size = 5, 
     vertex.frame.color = "gray", 
     vertex.label.color = "black", 
     vertex.label.cex = 0.8,
     edge.width = 2,
     edge.arrow.size = 0.5,
     edge.color = E(union_graph_undir)$color,
     edge.lty = E(union_graph_undir)$lty)


The number of connected components can again be calculated with both, igraph and **sna.

# is our network strongly or weakly connected?
igraph::is.connected(union_graph_undir, mode = "strong")
## [1] FALSE
igraph::is.connected(union_graph_undir, mode = "weak")
## [1] FALSE
igraph::count_components(union_graph_undir, mode = "strong")
## [1] 2
comp <- igraph::components(union_graph_undir, mode = "strong")
head(groups(comp))
## $`1`
##   [1] "Alys Arryn"                           
##   [2] "Elys Waynwood"                        
##   [3] "Jasper Arryn"                         
##   [4] "Jeyne Royce"                          
##   [5] "Jon Arryn"                            
##   [6] "Lysa Arryn"                           
##   [7] "Robert Arryn"                         
##   [8] "Rowena Arryn"                         
##   [9] "Cassana Baratheon"                    
##  [10] "Cersei Lannister"                     
##  [11] "Jaime Lannister"                      
##  [12] "Joffrey Baratheon"                    
##  [13] "Margaery Tyrell"                      
##  [14] "Myrcella Baratheon"                   
##  [15] "Ormund Baratheon"                     
##  [16] "Renly Baratheon"                      
##  [17] "Rhaelle Targaryen"                    
##  [18] "Robert Baratheon"                     
##  [19] "Selyse Florent"                       
##  [20] "Shireen Baratheon"                    
##  [21] "Stannis Baratheon"                    
##  [22] "Steffon Baratheon"                    
##  [23] "Tommen Baratheon"                     
##  [24] "Amarei Crakehall"                     
##  [25] "Amerei Frey"                          
##  [26] "Benfrey Frey"                         
##  [27] "Bethany Rosby"                        
##  [28] "Cleos Frey"                           
##  [29] "Dorna Swyft"                          
##  [30] "Edmure Tully"                         
##  [31] "Emmon Frey"                           
##  [32] "Genna Lannister"                      
##  [33] "Jeyne Darry"                          
##  [34] "Jyanna Frey"                          
##  [35] "Kevan Lannister"                      
##  [36] "Lancel Lannister"                     
##  [37] "Lyonel Frey"                          
##  [38] "Marissa Frey"                         
##  [39] "Mariya Darry"                         
##  [40] "Melesa Crakehall"                     
##  [41] "Merrett Frey"                         
##  [42] "Olyvar Frey"                          
##  [43] "Pate of the Blue Fork"                
##  [44] "Perra Royce"                          
##  [45] "Perwyn Frey"                          
##  [46] "Roose Bolton"                         
##  [47] "Roslin Frey"                          
##  [48] "Tion Frey"                            
##  [49] "Tywin Frey"                           
##  [50] "Walda Frey daughter of Merrett"       
##  [51] "Walder Frey"                          
##  [52] "Walder Frey son of Emmon"             
##  [53] "Walder Frey son of Merrett"           
##  [54] "Willamen Frey"                        
##  [55] "Willem Frey"                          
##  [56] "Alys Stackspear"                      
##  [57] "Alysanne Farman"                      
##  [58] "Antario Jast"                         
##  [59] "Cerenna Lannister"                    
##  [60] "Cerissa Brax"                         
##  [61] "Damion Lannister"                     
##  [62] "Damon Lannister"                      
##  [63] "Damon Lannister son of Jason"         
##  [64] "Darlessa Marbrand"                    
##  [65] "Daven Lannister"                      
##  [66] "Ella Lannister"                       
##  [67] "Ermesande Hayford"                    
##  [68] "Gerion Lannister"                     
##  [69] "Gerold Lannister"                     
##  [70] "Jason Lannister"                      
##  [71] "Jeyne Marbrand"                       
##  [72] "Joanna Lannister"                     
##  [73] "Lanna Lannister"                      
##  [74] "Lucion Lannister"                     
##  [75] "Marla Prester"                        
##  [76] "Myranda Lefford"                      
##  [77] "Myrielle Lannister"                   
##  [78] "Rohanne Webber"                       
##  [79] "Sansa Stark"                          
##  [80] "Shiera Crakehall"                     
##  [81] "Stafford Lannister"                   
##  [82] "Teora Kyndall"                        
##  [83] "Tybolt Lannister"                     
##  [84] "Tygett Lannister"                     
##  [85] "Tyrek Lannister"                      
##  [86] "Tyrion Lannister"                     
##  [87] "Tysha"                                
##  [88] "Tytos Lannister"                      
##  [89] "Tywin Lannister"                      
##  [90] "Willem Lannister"                     
##  [91] "Aegon Targaryen son of Rhaegar"       
##  [92] "Doran Martell"                        
##  [93] "Dorea Sand"                           
##  [94] "Elia Martell"                         
##  [95] "Elia Sand"                            
##  [96] "Ellaria Sand"                         
##  [97] "Loreza Sand"                          
##  [98] "Mellario of Norvos"                   
##  [99] "Mors Martell brother of Doran"        
## [100] "Nymeria Sand"                         
## [101] "Obara Sand"                           
## [102] "Obella Sand"                          
## [103] "Oberyn Martell"                       
## [104] "Princess of Dorne"                    
## [105] "Rhaegar Targaryen"                    
## [106] "Rhaenys Targaryen daughter of Rhaegar"
## [107] "Sarella Sand"                         
## [108] "Tyene Sand"                           
## [109] "Alys Karstark"                        
## [110] "Arra Norrey"                          
## [111] "Artos Stark"                          
## [112] "Arya Flint"                           
## [113] "Arya Stark"                           
## [114] "Benjen Stark"                         
## [115] "Beron Stark"                          
## [116] "Bran Stark"                           
## [117] "Brandon Stark"                        
## [118] "Brandon Stark Burner"                 
## [119] "Brandon Stark son of Cregan"          
## [120] "Catelyn Stark"                        
## [121] "Cregan Stark"                         
## [122] "Eddard Stark"                         
## [123] "Edwyle Stark"                         
## [124] "Gilliane Glover"                      
## [125] "Jeyne Westerling"                     
## [126] "Jon Snow"                             
## [127] "Jonnel Stark"                         
## [128] "Lorra Royce"                          
## [129] "Lyanna Glover"                        
## [130] "Lyanna Stark"                         
## [131] "Lyarra Stark"                         
## [132] "Lynara Stark"                         
## [133] "Lysara Karstark"                      
## [134] "Marna Locke"                          
## [135] "Melantha Blackwood"                   
## [136] "Myriame Manderly"                     
## [137] "Ramsay Bolton"                        
## [138] "Rickard Stark"                        
## [139] "Rickon Stark"                         
## [140] "Rickon Stark son of Benjen"           
## [141] "Robb Stark"                           
## [142] "Robyn Ryswell"                        
## [143] "Rodrik Stark"                         
## [144] "Rodrik Stark son of Beron"            
## [145] "Rodwell Stark"                        
## [146] "Talisa Stark"                         
## [147] "Willam Stark"                         
## [148] "Aegon V Targaryen"                    
## [149] "Aerys II Targaryen"                   
## [150] "Betha Blackwood"                      
## [151] "Daenerys Targaryen"                   
## [152] "Drogo"                                
## [153] "Duncan Targaryen"                     
## [154] "Dyanna Dayne"                         
## [155] "Hizdahr zo Loraq"                     
## [156] "Jaehaerys II Targaryen"               
## [157] "Jenny of Oldstones"                   
## [158] "Maekar I Targaryen"                   
## [159] "Maron Martell"                        
## [160] "Rhaella Targaryen"                    
## [161] "Shaera Targaryen"                     
## [162] "Viserys Targaryen"                    
## [163] "Hoster Tully"                         
## [164] "Minisa Whent"                         
## [165] "Petyr Baelish"                        
## [166] "Alerie Hightower"                     
## [167] "Elinor Tyrell"                        
## [168] "Elyn Norridge"                        
## [169] "Garlan Tyrell"                        
## [170] "Hobber Redwyne"                       
## [171] "Horas Redwyne"                        
## [172] "Leo Blackbar"                         
## [173] "Leonette Fossoway"                    
## [174] "Lia Serry"                            
## [175] "Loras Tyrell"                         
## [176] "Luthor Tyrell"                        
## [177] "Luthor Tyrell son of Moryn"           
## [178] "Luthor Tyrell son of Theodore"        
## [179] "Mace Tyrell"                          
## [180] "Medwick Tyrell"                       
## [181] "Mina Tyrell"                          
## [182] "Moryn Tyrell"                         
## [183] "Olene Tyrell"                         
## [184] "Olenna Redwyne"                       
## [185] "Paxter Redwyne"                       
## [186] "Theodore Tyrell"                      
## [187] "Unknown father Tyrell"                
## [188] "Unknown mother Tyrell"                
## [189] "Willas Tyrell"                        
## 
## $`2`
##  [1] "Aeron Greyjoy"           "Alannys Harlaw"         
##  [3] "Asha (Yara) Greyjoy"     "Balon Greyjoy"          
##  [5] "Donel Greyjoy"           "Erik Ironmaker"         
##  [7] "Euron Greyjoy"           "Harlon Greyjoy"         
##  [9] "Lady of House Piper"     "Lady of House Stonetree"
## [11] "Lady of House Sunderly"  "Maron Greyjoy"          
## [13] "Quellon Greyjoy"         "Quenton Greyjoy"        
## [15] "Robin Greyjoy"           "Rodrik Greyjoy"         
## [17] "Theon Greyjoy"           "Urrigon Greyjoy"        
## [19] "Victarion Greyjoy"
comp_dist <- igraph::component_distribution(union_graph_undir, mode = "strong")
comp_dist
##   [1] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
##  [18] 0.0 0.0 0.5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
##  [35] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
##  [52] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
##  [69] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
##  [86] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
## [103] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
## [120] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
## [137] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
## [154] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
## [171] 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
## [188] 0.0 0.0 0.5
#sna::components(adjacency)


Clustering

ceb <- cluster_edge_betweenness(union_graph_undir)
modularity(ceb)
## [1] 0.8359884
plot(ceb,
     union_graph_undir,
     layout = layout,
     vertex.label = gsub(" ", "\n", V(union_graph_undir)$name),
     vertex.shape = V(union_graph_undir)$shape,
     vertex.size = (V(union_graph_undir)$popularity + 0.5) * 5, 
     vertex.frame.color = "gray", 
     vertex.label.color = "black", 
     vertex.label.cex = 0.8)

clp <- cluster_label_prop(union_graph_undir)

plot(clp,
     union_graph_undir,
     layout = layout,
     vertex.label = gsub(" ", "\n", V(union_graph_undir)$name),
     vertex.shape = V(union_graph_undir)$shape,
     vertex.size = (V(union_graph_undir)$popularity + 0.5) * 5, 
     vertex.frame.color = "gray", 
     vertex.label.color = "black", 
     vertex.label.cex = 0.8)

“equiv.clust uses a definition of approximate equivalence (equiv.fun) to form a hierarchical clustering of network positions. Where dat consists of multiple relations, all specified relations are considered jointly in forming the equivalence clustering.” equiv.clust() help

ec <- equiv.clust(adj_network, mode = "graph", cluster.method = "average", plabels = network.vertex.names(adj_network))
ec # The clustering
## Position Clustering:
## 
##  Equivalence function: sedist 
##  Equivalence metric: hamming 
##  Cluster method: average 
##  Graph order: 208
ec$cluster$labels <- ec$plabels
plot(ec) # Plot the dendrogram

“Given a set of equivalence classes (in the form of an equiv.clust object, hclust object, or membership vector) and one or more graphs, blockmodel will form a blockmodel of the input graph(s) based on the classes in question, using the specified block content type. […] Unless a vector of classes is specified, blockmodel forms its eponymous models by using cutree to cut an equivalence clustering in the fashion specified by k and h. After forming clusters (roles), the input graphs are reordered and blockmodel reduction is applied.” blockmodel() help

bm <- blockmodel(adj_network, ec, h = 5, plabels = network.vertex.names(adj_network))
bm_bm <- bm$block.model
bm_bm[is.nan(bm_bm)] <- 1

gplot(bm_bm, 
      diag = TRUE, 
      edge.lwd = bm_bm * 5,
      vertex.cex = sqrt(table(bm$block.membership)) / 2,
      gmode = "graph", 
      vertex.sides = 50, 
      vertex.col = gray(1 - diag(bm_bm)))



sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 7 x64 (build 7601) Service Pack 1
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] NetIndices_1.4.4     MASS_7.3-47          statnet_2016.9      
##  [4] sna_2.4              ergm.count_3.2.2     tergm_3.4.0         
##  [7] networkDynamic_0.9.0 ergm_3.7.1           network_1.13.0      
## [10] statnet.common_3.3.0 igraph_1.0.1         dplyr_0.5.0         
## [13] purrr_0.2.2          readr_1.1.0          tidyr_0.6.2         
## [16] tibble_1.3.0         ggplot2_2.2.1        tidyverse_1.1.1     
## 
## loaded via a namespace (and not attached):
##  [1] lpSolve_5.6.13    reshape2_1.4.2    haven_1.0.0      
##  [4] lattice_0.20-35   colorspace_1.3-2  htmltools_0.3.6  
##  [7] yaml_2.1.14       foreign_0.8-68    DBI_0.6-1        
## [10] modelr_0.1.0      readxl_1.0.0      trust_0.1-7      
## [13] plyr_1.8.4        robustbase_0.92-7 stringr_1.2.0    
## [16] munsell_0.4.3     gtable_0.2.0      cellranger_1.1.0 
## [19] rvest_0.3.2       coda_0.19-1       psych_1.7.5      
## [22] evaluate_0.10     labeling_0.3      knitr_1.15.1     
## [25] forcats_0.2.0     parallel_3.4.0    DEoptimR_1.0-8   
## [28] broom_0.4.2       Rcpp_0.12.10      scales_0.4.1     
## [31] backports_1.0.5   jsonlite_1.4      mnormt_1.5-5     
## [34] hms_0.3           digest_0.6.12     stringi_1.1.5    
## [37] grid_3.4.0        rprojroot_1.2     tools_3.4.0      
## [40] magrittr_1.5      lazyeval_0.2.0    Matrix_1.2-10    
## [43] xml2_1.1.1        lubridate_1.6.0   assertthat_0.2.0 
## [46] rmarkdown_1.5     httr_1.2.1        R6_2.2.0         
## [49] nlme_3.1-131      compiler_3.4.0